-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Python: Start using the shared data flow libraries #3701
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Just a few quick thoughts (as you've somewhat preempted the issue I'm writing 🙂):
|
Go needs a separate sync mechanism because it's in a separate repo. For python simply using the |
You can use whatever path prefix you want - you don't have to use the |
Ah, I have yet to discover |
Perhaps a not-python-specific version of this could go into the shared implementation.
|
@aschackmull thanks for the meeting. It has resulted in |
also restore accidentally removed comment
|
I find all of this very interesting, so if you can, keep the documentation as detailed as it is now :) |
|
@intrigus-lgtm Do you mean |
need to actually have `OutNode`s
Yeah, |
also more grouping
Also, fix test for local flow
but I think they receive no flow
Ok, I am afraid I have made some messy progress lately :-) I will try to add my more stable findings to the readme so there will be some sort of thread.. |
between SSA variables and control flow nodes
python/ql/src/experimental/dataflow/internal/DataFlowPublic.qll
Outdated
Show resolved
Hide resolved
python/ql/src/experimental/dataflow/internal/DataFlowPrivate.qll
Outdated
Show resolved
Hide resolved
|
As part of my attempt at implementing field flow, I did some refactoring in a recent commit to #3830. I mainly got rid of If this is an improvement, I can do the same here.. |
to receive updates from data flow library
to get updated dataflow files
yep, sounds like the right approach! I haven't had the time to look everything over yet, but this one sounds like a win 👍 |
|
I agree with RasmusWL's comment above. Splitting it out into separate classes is very reminiscent of how it's done in the JavaScript libraries, and (I think) greatly increases the readability. |
RasmusWL
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Have some minor suggestions, but otherwise this looks like a good first start 🎉
Some thoughts after reading through the PR:
getEnclosingCallabledoesn't really fit the way Python programs are executed, since you can execute things in the module scope. I guess the most straightforward way of shoehorning Python into the existing framework would be to define a "fake" callable for the module scope.ReturnKindmight be useful for differentiating between the return values of normal functions, generators (using yield), and async function returns.
What happens with named arguments? What does C# do?
In C# using named arguments is always interchangeable with using positional arguments, so the situation is not quite the same. Notice that in Python these are called keyword arguments (as opposed to named arguments in C#). I have been digging into our handling of keyword arguments recently, so would be happy to chat about this 👍
python/ql/src/experimental/dataflow/internal/TaintTrackingPrivate.qll
Outdated
Show resolved
Hide resolved
python/ql/src/experimental/dataflow/internal/DataFlowPublic.qll
Outdated
Show resolved
Hide resolved
python/ql/src/experimental/dataflow/internal/DataFlowPublic.qll
Outdated
Show resolved
Hide resolved
python/ql/src/experimental/dataflow/internal/DataFlowPublic.qll
Outdated
Show resolved
Hide resolved
python/ql/src/experimental/dataflow/internal/DataFlowPublic.qll
Outdated
Show resolved
Hide resolved
python/ql/src/experimental/dataflow/internal/DataFlowPrivate.qll
Outdated
Show resolved
Hide resolved
python/ql/src/experimental/dataflow/internal/DataFlowPrivate.qll
Outdated
Show resolved
Hide resolved
python/ql/src/experimental/dataflow/internal/DataFlowPrivate.qll
Outdated
Show resolved
Hide resolved
python/ql/src/experimental/dataflow/internal/DataFlowPrivate.qll
Outdated
Show resolved
Hide resolved
…ate.qll Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
RasmusWL
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Looks good to me now 👍 (but let's get approval from Taus as well before merging)
The part about documentation for toString actually applied to multiple member-predicate definitions, I guess I should have called that out 😅
Lastly I have to admit that I didn't look too closely on the tests and expected output, since it seemed quite cumbersome. I realize I forgot to mention this, and if there is anything in particular you want me to look at, let me know :) Would be great if we could get some way to annotate the test code so it's easy to spot in the .expected files if anything is not working as we expect it to.
|
Ah, I thought I saw another toString and got worried that I had not pulled the merged suggestion, but then I could not find it again :-) The tests should be annotated, yes, and some are perhaps more debug than tests at the moment. It might be nice to have tests for all the classes of nodes in the interface, then we can quickly see if we change the meaning of those.. |
|
I am happy to merge the PR as it is. Regarding Regarding test output, I also didn't study it in detail, but did a bit of random sampling here and there. I think improving the test setup for this should be a fairly high-priority item. |
Pull the identical files into
codeql/python/ql/src/semmle/code/python/dataflow, note that we do not currently have a directory calledcodeundersemmle, but the other users of the shared libraries use this location.Implement enough stubs to have all files compile while keeping the identical files identical.
Record the Python components in
codeql/config/identical-files.json.Fill in the stubs with references to the
PointsTolibraryTest that we get desired data flows
Switch to new CFG implementation based on type tracking
Test that we get desired data flows